Phonetic unit localization in a multi-expert recognition system

Authors

  • Hélène Tattegrain
  • Jean Caelen
Abstract

This paper describes an acoustic-to-phonetic decoder (APD) based on a mixed strategy: a) a bottom-up strategy, which hypothesizes the most robust information about the speech signal, and b) a top-down strategy, which verifies the acoustic features or the macro-class localization on the speech signal. In this paper, only the bottom-up strategy is described. In our system, a phoneme is described as a phonetic network whose nodes are mapped onto the acoustic signal. The coarse phonetic description then uses five phonetic networks whose nodes correspond to the acoustic phases of the analyzed sound in the speech signal. These phases are extracted by automatic segmentation using different parameters (energy, pitch, formant frequencies, acoustic cues from an ear model). The bottom-up APD is divided into three steps: a) the first step localizes pseudo-phonetic segments (called acoustic phases) on the signal and defines phoneme boundaries according to a macro-class description (stop consonants, fricatives, other consonants, vowels and pauses); b) in the second step, context-sensitive rules are applied to filter out the most improbable solutions; c) the third step labels the most significant phase of each phoneme with acoustic features (using Bayesian methods). Performance is measured by comparing automatically generated labels with manually generated labels: for example, the detection rate is 97% for plosive bursts and 94.3% for the occlusive phonetic network. The strategy is implemented in Prolog II.
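Steps b) and c) of the bottom-up pass can be illustrated with a minimal Python sketch. This is not the authors' Prolog II implementation: the phase labels (`closure`, `burst`, `frication`), the single context rule, the one-dimensional Gaussian feature models, and the function names are all assumptions made for illustration, since the abstract does not specify them.

```python
import math

def apply_context_rule(candidates, previous_label):
    """Step b sketch: filter improbable hypotheses using left context.

    Hypothetical rule (not from the paper): a 'burst' hypothesis is kept
    only when the preceding phase was labeled 'closure'.
    """
    if previous_label != "closure":
        return {k: v for k, v in candidates.items() if k != "burst"}
    return dict(candidates)

def gaussian(x, mean, std):
    """Gaussian density, used here as an assumed class-conditional likelihood."""
    return math.exp(-0.5 * ((x - mean) / std) ** 2) / (std * math.sqrt(2 * math.pi))

def bayes_label(feature, models, priors):
    """Step c sketch: choose the label maximizing prior * likelihood.

    `models` maps each label to an assumed (mean, std) pair for one feature.
    """
    scores = {
        label: priors[label] * gaussian(feature, mean, std)
        for label, (mean, std) in models.items()
    }
    return max(scores, key=scores.get)

# Example with made-up numbers: no closure precedes, so 'burst' is discarded.
kept = apply_context_rule({"burst": 0.6, "frication": 0.4}, previous_label="vowel")
label = bayes_label(1.0, {"a": (0.0, 1.0), "b": (5.0, 1.0)}, {"a": 0.5, "b": 0.5})
```

In a rule-based system such as the one described, the context rules would be declarative clauses rather than a hard-coded condition; the function form here only shows where filtering sits relative to the Bayesian labeling.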

Similar resources

A Connectionist Expert Approach

Artificial Neural Networks (ANNs) are widely and successfully used in speech recognition, but many limitations are still inherent in their topologies and learning style. In an attempt to overcome these limitations, we combine, in a hybrid speech recognition system, the pattern processing of ANNs and the logical inferencing of symbolic approaches. In particular, we are interested in the Connectio...

Introducing phonetically motivated information into ASR

In this paper we present an approach to introducing more phonetically motivated information into automatic speech recognition in the form of a phonetic ‘expert’. To avoid the curse of dimensionality problem, the expert information is introduced at the level of the acoustic model. Two types of experts are used, each providing discriminative information regarding groups of phonetically related ph...

Comparison of spectral methods for spoken language identification

Automatic spoken language identification is the task of identifying a language from the speech signal. Language identification systems can be divided into two categories: spectral-based methods and phonetic-based methods. In the former, short-time characteristics of the speech spectrum are extracted as a multi-dimensional vector. The statistical model of these features is then obtained for each language. The ...

Phonetic study for automatic recognition of Arabic

We propose in this paper a phonetic study of standard Arabic based essentially on the spectrographic views of 50 sentences of the DJOUMA corpus that we have constituted. This study allowed us to determine the pertinent parameters of continuous speech recognition. For the recognition part, we present the algorithms developed and the results obtained for three important questions: The segmentation o...

A Super Phonetic System and Multi-dialect Chinese Speech Corpus for Speech Recognition

In this paper, we describe our work on Chinese multi-dialect speech processing. Based on the phonetic analysis of ten Chinese dialects, we have created a Chinese super phonetic system for Chinese speech recognition. To examine this phonetic system and develop Chinese dialect speech technology, we are building a multi-dialect speech corpus, which includes 10 dialect areas and 2000 speakers.

Publication date: 1989